fix(studio): filter runtime-generated nodes from resolver-shadow telemetry#1795
Conversation
…metry The sdk_resolver_shadow tripwire flagged element_not_found for nodes a composition <script> creates at runtime (caption word/group spans, etc.). These have no static data-hf-id, so the SDK session (a static parse) cannot model them by design; the divergence is noise, not a resolver bug. - Runtime-node filter: suppress element_not_found when the resolved hf-id is absent from the on-disk source. An id PRESENT in source but missing from the session stays flagged (the genuine v0.6.110-class resolver divergence). - Add sessionElementCount to all element_not_found / animation_not_found emits (0 = empty/broken session, >0 = element-specific). - Add sourceHfIdCount to emitted element_not_found: =1 = static node the parse dropped (foreign-content exclusion / sub-comp gap), >1 = duplicate-id resolver ambiguity. Scoped to the DOM-edit path. Telemetry-only; no disk writes, no edit change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This stack of pull requests is managed by Graphite. Learn more about stacking. |
miga-heygen
left a comment
There was a problem hiding this comment.
Solid, well-scoped PR. The runtime-node filter logic is sound, the telemetry metadata enrichments are useful for triaging the remaining buckets, and the test coverage hits the important cases. CI is green on all required checks. A few observations:
1. includes vs countHfIdInSource semantic gap (minor — metadata inconsistency, not a bug)
The filter gate at L180 uses sourceContent.includes(hfId) (raw substring), while countHfIdInSource uses the attribute-scoped data-hf-id="${id}". For an id like hf-a where source only contains data-hf-id="hf-abc", includes("hf-a") returns true (correctly keeps the signal — no false suppression), but countHfIdInSource("...", "hf-a") returns 0. The emitted event would carry sourceHfIdCount: 0 for an element_not_found that survived the filter — a "present in source but zero attribute matches" state that could confuse triage dashboards. The bias direction is correct (overshoot toward keeping signal), but making the filter gate use countHfIdInSource(sourceContent, hfId) > 0 instead of includes would keep both checks on the same semantic level and avoid the zero-count anomaly. Not blocking — the noise budget on this edge is tiny (hf-ids with shared prefixes are uncommon) and the "ponytail" comment documents the intent.
2. sessionElementCount double-call in runResolverShadow
sdkResolverShadowCheck already calls resolveSnapshot → session.getElements() internally, and then runResolverShadow calls session.getElements().length again for the telemetry payload. Since getElements() returns a fresh snapshot array each time, this is a second full traversal. For the telemetry-only hot path on every style/text/attr edit, the double call is worth noting. Not blocking since the check only reaches telemetry on divergence (early return on mismatches.length === 0), so the second call fires only on the already-rare emit path.
3. Test coverage
The three new C8 cases cover the filter logic well (suppressed / kept / count). One thing that would strengthen the suite: a test for the includes vs countHfIdInSource gap itself — e.g. id "hf-a" against source data-hf-id="hf-abc" — to document the expected behavior when the loose match hits but the tight count doesn't. Not blocking; flagging it as a future hardening opportunity.
4. Everything else looks clean
countHfIdInSourcesplit technique is correct (no regex escaping needed for hf-ids).sourceContent?: stringoptional parameter is backward compatible.- Existing redaction untouched — no PII risk from the new count fields.
- The intentional exclusion of
recordResolverParity/recordAnimationResolverParityfrom the runtime-node filter is well-documented in the PR body. recordAnimationResolverParitycachessession.getElements()to avoid the double-call — nice touch.
Overall: clean telemetry-only change with correct bias direction. LGTM with the minor notes above.
— Miga
miguel-heygen
left a comment
There was a problem hiding this comment.
Reviewed 10743d85.
Verdict: APPROVE.
The runtime-node filter is directionally correct and scoped to resolver-shadow telemetry only: when Studio resolves a live DOM hf-id that is absent from the on-disk source, the event is suppressed because the SDK static session cannot model script-created runtime nodes by design. Static ids that are present in source but missing from the SDK session still emit, preserving the actionable v0.6.110-class resolver-divergence signal.
Verified:
useDomEditSessionnow passesoriginalContentonly into the shadow check path; edit persistence behavior is unchanged.sdkResolverShadowChecksuppresses absent-from-source element_not_found cases and keeps present-in-source missing-session cases.sourceHfIdCountis telemetry metadata only; source content itself is not emitted.sessionElementCountis added to element/animation not-found telemetry, which should make empty-session vs element-specific buckets easier to triage.- Tests cover the important suppression/keep/count cases.
Non-blocking: I agree with Miga that the sourceContent.includes(hfId) gate and countHfIdInSource() use slightly different semantics. A shared countHfIdInSource(sourceContent, hfId) > 0 gate would avoid rare substring-prefix anomalies (hf-a vs hf-abc) where an event survives with sourceHfIdCount: 0. The current bias is safe because it keeps extra signal instead of suppressing a real bug.
CI has no failures at review time; Windows render/test jobs are still pending.
— Magi
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Reviewed 10743d85 — layering on Miga's COMMENTED and Magi's APPROVED reviews from this morning. ✅ LGTM (telemetry-only, scoped, correct bias direction).
Summary
Adds a runtime-node filter (sourceContent.includes(hfId) gate) to suppress noisy element_not_found for script-created nodes, plus sessionElementCount on all emit sites and sourceHfIdCount on the DOM-edit path's element_not_found events. Filter live only on runResolverShadow (DOM-edit path) — the element-targeted cutover chokepoints (recordResolverParity, recordAnimationResolverParity) stay unguarded, which the PR body documents as deliberate (no sync source access; low-volume).
Findings
🟡 Dashboard-consumer impact of the includes vs countHfIdInSource gap (extending Miga's finding 1)
Miga's note about the semantic gap is mechanically correct (verified — "data-hf-id=\"hf-abc\"".includes("hf-a") is true; countHfIdInSource(src, "hf-a") returns 0). The downstream consequence is worth surfacing explicitly: the PR body and the comment at sdkResolverShadow.ts:277-281 document sourceHfIdCount as a {=1: single-static-parse-drop, >1: duplicate-id} two-bucket signal, but the gap introduces a third bucket — =0 — for ids that pass the loose includes gate as a longer-id prefix. Anyone querying PostHog with sourceHfIdCount == 1 to size the "parser dropped a static node" cohort will miss those events, and sourceHfIdCount > 0 won't catch them either.
Two cheap fixes, take either:
- Tighten the gate to
countHfIdInSource(sourceContent, hfId) > 0(Miga's suggestion) — keeps the two-bucket invariant by construction. - Keep the loose gate, but document
=0as the third explicit bucket ("loose-prefix match — same disposition as runtime-only; suppress at dashboard layer") and add a regression test pinning the behavior. The "ponytail" comment already nods at this; just make the dashboard-side intent explicit.
I'd take (1) — the cost is one extra split-pair on the hot-path early return, and it preserves the documented mental model. But not blocking.
🟡 Test for the includes/count boundary (seconding Miga's #3)
A test like the one Miga sketched — id "hf-a" against source data-hf-id="hf-abc" — would pin the current behavior either way (whether you take fix #1 or fix #2 above). The C8 cases as-written cover the canonical paths (runtime-only / static / dup) but don't cover the prefix-overlap corner that's now load-bearing for the filter's correctness.
🟡 runResolverShadow failure-path coverage (per the observability rubric)
The new metadata is only emitted on the element_not_found mismatch branch (correct). But the existing dispatch_error branch at sdkResolverShadow.ts:214 returns a mismatch without the runtime-node filter being consulted (the filter only sits in the unresolved-snapshot early return at L177-187), so a dispatch_error on an actually-runtime-only hfId would still emit. Not a new gap — same behavior as before this PR — but the PR's stated goal is "reduce noise on the DOM-edit path." If captions.html-class runtime ids ever reach the dispatch path (i.e. resolveSnapshot succeeds via the bare-match fallback at L86-89), they'd skip the new filter. Worth a follow-up grep to confirm the bare-match fallback never resolves a runtime-only id to a real session element. If it does, that's a different noise bucket the filter doesn't catch.
✅ Verified clean
originalContentatuseDomEditCommits.ts:153-157is the full file body (read via/api/projects/.../files/<targetPath>), so theincludessubstring search has the right surface — it's not scoped to a selection subtree.trackStudioEvent(studioTelemetry.ts:69-77) is synchronous queue-push; noimport().thenasync-import race for early-call telemetry loss (one of the failure-modes I always check on telemetry PRs — clean here).- PostHog payload values are bounded:
sessionElementCountis an int,sourceHfIdCountis an int, no user content. No cardinality / PII concern from the new fields. ExistingredactMismatchesredaction ofexpected/actualis untouched. recordAnimationResolverParitycache ofsession.getElements()(L344) is a nice incidental cleanup; matches Miga's note.- The intentional exclusion of
recordResolverParity/recordAnimationResolverParityfrom the source-filter is documented in the PR body ("low-volume; left unguarded") and the source-access constraint is real —sdkTimingPersist/sdkGsapTweenPersistdon't read the file.
✅ Sibling-asymmetry check (clean)
The three trackStudioEvent("sdk_resolver_shadow", ...) emit sites at L272 / L312 / L347 are intentionally asymmetric on sourceHfIdCount (DOM-edit only) and intentionally symmetric on sessionElementCount (all three). Matches the PR-body intent. The redaction shape is identical across all three.
Verdict
LGTM. The includes vs count gap (Miga's #1) is the most actionable polish; my dashboard-bucket reframe makes the "fix vs document" tradeoff explicit. None of it blocks merge — the bias direction is correct (keep signal when in doubt), and the documented goal of cutting captions.html-class noise is preserved.
— Rames D Jusso

What
Reduce noise in the
studio:sdk_resolver_shadowtelemetry tripwire and add metadata to triage what remains. Telemetry-only — no disk writes, no change to user-visible edits.Why
The resolver-shadow tripwire (PR #1547) emits
element_not_foundwhenever the SDK session can't resolve an hf-id the server patch path targets. PostHog (SDK Resolver Shadow Parity) showed the dominant live class is runtime-generated nodes — elements a composition<script>creates at runtime (caption word/group spans, etc.). These have no staticdata-hf-id, so the SDK session (a static parse) cannot model them by design. They are noise, not a resolver bug, and they were drowning the genuine v0.6.110-class divergences the tripwire exists to catch.Investigation (last 14d of events) ruled out the alternatives:
comp == selFilein 39/40 top rows.hf-0ytcetc.) are all pre-fix0.6.111noise.captions.html .captions-container, traces topackages/studio/src/captions/generator.tsbuilding word/group nodes viacreateElementat runtime.Changes
sdkResolverShadowCheck/runResolverShadow): suppresselement_not_foundwhen the resolved hf-id is absent from the on-disk source. An id present in source but missing from the session stays flagged — that's the real resolver divergence. The "is it in source" test is the independent signal; "is it in the session" is circular (that's what the tripwire measures). Scoped to the DOM-edit path, where every observed runtime-node divergence lives.sessionElementCounton allelement_not_found/animation_not_foundemits:0= empty/broken session (actionable),>0= element-specific.sourceHfIdCounton emittedelement_not_found:=1= a single static node the parse dropped (foreign-content exclusion / sub-comp inlining gap),>1= duplicate ids → resolver picked the wrong instance. Forks the one remaining unexplained bucket (#soe-comment-style template ids) directly from telemetry.Test
packages/studio/src/utils/sdkResolverShadow.test.ts— added cases: absent-from-source → suppressed; present-in-source-but-missing → still flagged; duplicate-id →sourceHfIdCount: 2. 32 pass under vitest.Not in scope
sourceHfIdCountsplit from a day of telemetry first.setTiming/removeElement/reorder/addGsapTween) have no sync source access and are low-volume; left unguarded.🤖 Generated with Claude Code